Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid holding onto the buffer when parsing unknown length-delimited fields #863

Merged
merged 5 commits into from
Aug 2, 2023

Conversation

osa1
Copy link
Member

@osa1 osa1 commented Jul 29, 2023

Currently unknown field set parser stores length-delimited fields as views of the input buffer. This has a few issues:

  • It's inconsistent with the known-field parsing where we copy bytes fields.

  • A single unknown length-delimited field can cause keeping a large input buffer alive.

  • Because the caller is free to modify the input buffer, this implementation does not allow freezing an unknown field set.

    (This can also be fixed by copying the length-delimited fields when freezing.)

  • Even when the parsed message is not frozen, this aliasing can cause bugs as it's not documented. Even if we document it, it would probably be a footgun.

    This can cause segfaults or worse when the input buffer is passed from e.g. C++ or C as a Uint8List, and the caller frees the buffer after parsing while the message is still in use.

This PR makes readBytes copy the bytes before returning to avoid aliasing. A new member readBytesAsView added with the previous behavior. Unknown length-delimited field parsing fixed with the new readBytes.

Sync CL: cl/552077275

…ields

Currently unknown field set parser stores length-delimited fields as
views of the input buffer. This has a few issues:

- It's inconsistent with the known-field parsing where we copy `bytes`
  fields.

- A single unknown length-delimited field can cause keeping a large
  input buffer alive.

- Because the caller is free to modify the input buffer, this
  implementation does not allow freezing an unknown field set.

  (This can also be fixed by copying the length-delimited fields when
  freezing.)

- Even when the parsed message is not frozen this aliasing can cause
  bugs as it's not documented. Even if we documented it, it would
  probably be a footgun.

- Most importantly, this can cause segfaults or worse when the input
  buffer is passed from e.g. C++ or C as a `Uint8List`, and the caller
  frees the buffer after parsing while the message is still in use.

This PR updates length-delimited unknown field parsing to copy the bytes
before storing the field in `UnknownFieldSet`.
@osa1 osa1 requested a review from mkustermann July 29, 2023 05:36
Copy link
Collaborator

@mkustermann mkustermann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @osa1

protobuf/lib/src/protobuf/coded_buffer_reader.dart Outdated Show resolved Hide resolved
protobuf/CHANGELOG.md Outdated Show resolved Hide resolved
@osa1 osa1 merged commit 217c030 into master Aug 2, 2023
18 checks passed
@osa1 osa1 deleted the osa1/unknown_bytes_parsing branch August 2, 2023 07:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants